AUTOMATIC BILINGUAL TERMINOLOGY EXTRACTION: A Practical Approach
Abstract
Faced with large and steadily increasing work volumes, the Patent Cooperation Treaty Translation Section at the World Intellectual Property Organization, Geneva, is looking for ways to improve the efficiency of its translation process. A terminology problem has been identified, and attention has turned to automatic bilingual terminology extraction as a possible means of solving that problem. A project has been defined and evaluation tests implemented with the aims of automatically capturing bilingual terminology from existing technical texts and their translations, validating the candidate term pairs generated, defining an appropriate database structure and generating terminological records in an automatic or semi-automatic manner. Benefits of this approach are becoming apparent and, as work progresses, the potential for extending the scope of the project to other related applications offers interesting prospects for the future.
Related resources
Mutual terminology extraction using a statistical framework
In this paper, we explore a statistical framework for mutual bilingual terminology extraction. We propose three probabilistic models to assess the proposition that automatic alignment can play an active role in bilingual terminology extraction and translate it into mutual bilingual terminology extraction. The results indicate that such models are valid and can show that mutual bilingual termino...
Mutual Bilingual Terminology Extraction
This paper describes a novel methodology to perform bilingual terminology extraction, in which automatic alignment is used to improve the performance of terminology extraction for each language. The strengths of monolingual terminology extraction for each language are exploited to improve the performance of terminology extraction in the other language, thanks to the availability of a sentence-l...
Using machine learning to perform automatic term recognition
In this paper a machine learning approach is applied to Automatic Term Recognition (ATR). Similar approaches have been successfully used in Automatic Keyword Extraction (AKE). Using a dataset consisting of Swedish patent texts and validated terms belonging to these texts, unigrams and bigrams are extracted and annotated with linguistic and statistical feature values. Experiments using a varying...
Automatic processing of multilingual medical terminology: applications to thesaurus enrichment and cross-language information retrieval
OBJECTIVES: We present in this article experiments on multi-language information extraction and access in the medical domain. For such applications, multilingual terminology plays a crucial role when working on specialized languages and specific domains. MATERIAL AND METHODS: We propose firstly a method for enriching multilingual thesauri which extracts new terms from parallel corpora, and seco...
A Comparison of Unsupervised Bilingual Term Extraction Methods Using Phrase-Tables
Automatic bilingual term extraction is essential for providing a consistent bilingual term list for human translators engaged in translating a set of documents. We compare three statistical measures for extracting bilingual terms from a phrase-table built from a parallel corpus. We show that these measures extract different bilingual term candidates and a combination of these measures ranks vali...
Publication date: 2010